ASF-Net: Robust Video Deraining via Temporal Alignment and Online Adaptive Learning
In recent times, learning-based methods for video deraining have demonstrated
commendable results. However, there are two critical challenges that these
methods are yet to address: exploiting temporal correlations among adjacent
frames and ensuring adaptability to unknown real-world scenarios. To overcome
these challenges, we approach video deraining from two directions: paradigm
design and learning strategy construction. Specifically, we propose a new computational
paradigm, Alignment-Shift-Fusion Network (ASF-Net), which incorporates a
temporal shift module. This module is novel to this field and provides deeper
exploration of temporal information by facilitating the exchange of
channel-level information within the feature space. To fully exploit the
model's representational capacity, we further construct a LArge-scale RAiny
video dataset (LARA), which also supports the development of this research community. On
the basis of the newly constructed dataset, we explore the parameter-learning
process by developing an innovative re-degraded learning strategy. This
strategy bridges the gap between synthetic and real-world scenes, resulting in
stronger scene adaptability. Our proposed approach exhibits superior
performance on three benchmarks and compelling visual quality in real-world
scenarios, underscoring its efficacy. The code is available at
https://github.com/vis-opt-group/ASF-Net
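The abstract does not give the exact formulation of the temporal shift module, but the idea of exchanging channel-level information across adjacent frames follows the well-known temporal-shift pattern. The sketch below is a minimal NumPy illustration under assumed details (the `temporal_shift` name, the shift fraction, and the zero-padded boundaries are all hypothetical, not taken from ASF-Net):

```python
import numpy as np

def temporal_shift(features, shift_fraction=0.25):
    """Shift a fraction of channels along the time axis.

    features: array of shape (T, C, H, W), i.e. T adjacent frames.
    The first `fold` channels are shifted one frame forward in time,
    the next `fold` channels one frame backward; the rest stay put.
    Boundary frames receive zeros for the shifted channels.
    """
    T, C, H, W = features.shape
    fold = int(C * shift_fraction)
    out = np.zeros_like(features)
    out[1:, :fold] = features[:-1, :fold]               # shift forward in time
    out[:-1, fold:2 * fold] = features[1:, fold:2 * fold]  # shift backward
    out[:, 2 * fold:] = features[:, 2 * fold:]          # unshifted channels
    return out

# toy example: 4 frames, 8 channels, 16x16 feature maps
x = np.random.randn(4, 8, 16, 16)
y = temporal_shift(x)
```

After the shift, a plain per-frame convolution can mix information from neighboring frames at zero extra parameter cost, which is the usual motivation for this kind of module.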
LATFormer: Locality-Aware Point-View Fusion Transformer for 3D Shape Recognition
Recently, 3D shape understanding has achieved significant progress due to the
advances of deep learning models on various data formats like images, voxels,
and point clouds. Among them, point clouds and multi-view images are two
complementary modalities of 3D objects and learning representations by fusing
both of them has been proven to be fairly effective. While prior works
typically focus on exploiting global features of the two modalities, herein we
argue that more discriminative features can be derived by modeling "where to
fuse". To investigate this, we propose a novel Locality-Aware Point-View
Fusion Transformer (LATFormer) for 3D shape retrieval and classification. The
core component of LATFormer is a module named Locality-Aware Fusion (LAF) which
integrates the local features of correlated regions across the two modalities
based on the co-occurrence scores. We further propose to filter out scores with
low values to obtain salient local co-occurring regions, which reduces
redundancy for the fusion process. In our LATFormer, we utilize the LAF module
to fuse the multi-scale features of the two modalities both bidirectionally and
hierarchically to obtain more informative features. Comprehensive experiments
on four popular 3D shape benchmarks covering 3D object retrieval and
classification validate its effectiveness.
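The abstract describes fusing local features of the two modalities weighted by co-occurrence scores, with low scores filtered out to keep only salient co-occurring regions. A minimal NumPy sketch of that idea follows; the function name, the dot-product scoring, the top-k filtering rule, and the residual aggregation are all assumptions for illustration, not the LAF module's actual design:

```python
import numpy as np

def locality_aware_fusion(point_feats, view_feats, top_k=4):
    """Fuse local region features of two modalities by co-occurrence score.

    point_feats: (Np, D) local point-cloud region features
    view_feats:  (Nv, D) local multi-view image region features
    Scores below each row's top-k are masked out, so only salient
    co-occurring regions contribute to the fused point features.
    """
    # co-occurrence scores: pairwise similarity between local regions
    scores = point_feats @ view_feats.T                   # (Np, Nv)
    # keep only the k largest scores per point region
    thresh = np.sort(scores, axis=1)[:, -top_k][:, None]
    scores = np.where(scores >= thresh, scores, -np.inf)
    # softmax over the retained scores, then aggregate view features
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    fused = point_feats + weights @ view_feats            # residual fusion
    return fused
```

Applying the same operation in the opposite direction (views attending to points) and at several feature scales would correspond to the bidirectional, hierarchical fusion the abstract mentions.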
A new method using deep transfer learning on ECG to predict the response to cardiac resynchronization therapy
Background: Cardiac resynchronization therapy (CRT) has emerged as an
effective treatment for heart failure patients with electrical dyssynchrony.
However, accurately predicting which patients will respond to CRT remains a
challenge. This study explores the application of deep transfer learning
techniques to train a predictive model for CRT response. Methods: In this
study, the short-time Fourier transform (STFT) technique was employed to
transform ECG signals into two-dimensional images. A transfer learning approach
was then applied on the MIT-BIH ECG database to pre-train a convolutional
neural network (CNN) model. The model was fine-tuned to extract relevant
features from the ECG images, and then tested on our dataset of CRT patients to
predict their response. Results: Seventy-one CRT patients were enrolled in this
study. The transfer learning model achieved an accuracy of 72% in
distinguishing responders from non-responders in the local dataset.
Furthermore, the model showed good sensitivity (0.78) and specificity (0.79) in
identifying CRT responders. Our model outperformed clinical
guidelines and traditional machine learning approaches. Conclusion: The
utilization of ECG images as input and leveraging the power of transfer
learning allows for improved accuracy in identifying CRT responders. This
approach offers potential for enhancing patient selection and improving
outcomes of CRT.
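The abstract's preprocessing step (STFT of a 1-D ECG signal into a 2-D image) can be sketched with plain NumPy. The window length, hop size, and log-magnitude scaling below are assumed values for illustration; the study's actual STFT parameters are not given:

```python
import numpy as np

def ecg_to_stft_image(signal, win=64, hop=16):
    """Convert a 1-D ECG signal into a 2-D log-magnitude STFT image.

    Frames the signal with a Hann window, takes the real FFT of each
    frame, and returns a (frequency x time) array suitable as CNN input.
    """
    window = np.hanning(win)
    n_frames = 1 + (len(signal) - win) // hop
    frames = np.stack([signal[i * hop : i * hop + win] * window
                       for i in range(n_frames)])
    spec = np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, win // 2 + 1)
    return np.log1p(spec).T                     # frequency x time

# toy example: a 10 Hz tone sampled at 250 Hz for ~2 s
t = np.arange(512) / 250.0
img = ecg_to_stft_image(np.sin(2 * np.pi * 10 * t))
```

Representing the signal as an image is what lets the study reuse a CNN pre-trained on other ECG spectrograms, the core of the transfer-learning setup.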